We consider the inverse acoustic obstacle problem for sound-soft star-shaped obstacles in two dimensions wherein the boundary of the obstacle is determined from measurements of the scattered field at a collection of receivers outside the object. One of the standard approaches for solving this problem is to reformulate it as an optimization problem: finding the boundary of the domain that minimizes the $L^2$ distance between computed values of the scattered field and the given measurement data. The optimization problem is computationally challenging since the local set of convexity shrinks with increasing frequency and results in an increasing number of local minima in the vicinity of the true solution. In many practical experimental settings, low frequency measurements are unavailable due to limitations of the experimental setup or the sensors used for measurement. Thus, obtaining a good initial guess for the optimization problem plays a vital role in this environment. We present a neural network warm-start approach for solving the inverse scattering problem, where an initial guess for the optimization problem is obtained using a trained neural network. We demonstrate the effectiveness of our method with several numerical examples. For high frequency problems, this approach outperforms traditional iterative methods such as Gauss-Newton initialized without any prior (i.e., initialized using a unit circle), or initialized using the solution of a direct method such as the linear sampling method. The algorithm remains robust to noise in the scattered field measurements and also converges to the true solution for limited aperture data. However, the number of training samples required to train the neural network scales exponentially in frequency and the complexity of the obstacles considered. We conclude with a discussion of this phenomenon and potential directions for future research.
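The role of the initial guess in a Gauss-Newton iteration can be illustrated with a minimal sketch. The scattering forward operator is not reproduced here: the one-parameter exponential model, the names `gauss_newton`, `residual`, and `jacobian`, and the starting value are all hypothetical stand-ins chosen only to show the structure of the iteration that a warm start would feed into.

```python
import numpy as np

def gauss_newton(residual, jacobian, x0, max_iter=50, tol=1e-12):
    """Minimize 0.5 * ||residual(x)||^2 by plain Gauss-Newton steps from x0."""
    x = np.asarray(x0, dtype=float).copy()
    for _ in range(max_iter):
        r = residual(x)
        J = jacobian(x)
        # Solve the linearized least-squares problem J @ step = -r.
        step, *_ = np.linalg.lstsq(J, -r, rcond=None)
        x = x + step
        if np.linalg.norm(step) < tol:
            break
    return x

# Toy forward model (an assumption, not the scattering operator):
# measurements exp(-p * t) at 20 sample points, with true parameter 1.5.
t = np.linspace(0.0, 1.0, 20)
p_true = 1.5
data = np.exp(-p_true * t)

def residual(p):
    return np.exp(-p[0] * t) - data

def jacobian(p):
    # Derivative of the model with respect to p, as a (20, 1) matrix.
    return (-t * np.exp(-p[0] * t))[:, None]

# The initial guess x0 is where a trained network's prediction would enter.
p_hat = gauss_newton(residual, jacobian, x0=[1.0])
```

In the setting of the abstract, `x0` would be the boundary parametrization predicted by the neural network rather than a unit circle or a linear sampling reconstruction.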
Translated by Google Translate
This work is concerned with efficiently solving for neural network-based feedback controllers in optimal control problems. We first conduct a comparative study of two mainstream approaches: offline supervised learning and online direct policy optimization. Although the training part of the supervised learning approach is relatively easy, the success of the method depends heavily on the optimal control dataset generated by open-loop optimal control solvers. In contrast, direct policy optimization turns the optimal control problem into an optimization problem directly, without any pre-computation, but the dynamics-related objective can be hard to optimize when the problem is complicated. Our results highlight the superiority of offline supervised learning in terms of both optimality and training time. To overcome the main challenge of each approach, the dataset for supervised learning and the optimization for direct policy optimization, we combine the two and propose the Pre-train and Fine-tune strategy as a unified training paradigm for optimal feedback control, which further improves performance and robustness significantly. Our code is available at https://github.com/yzhao98/DeepOptimalControl.
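We propose an efficient, reliable, and interpretable global solution method, $\textit{Deep learning-based algorithm for Heterogeneous Agent Models, DeepHAM}$, for solving high-dimensional heterogeneous agent models with aggregate shocks. The state distribution is approximately represented by a set of optimal generalized moments. Deep neural networks are used to approximate the value and policy functions, and the objective is optimized over directly simulated paths. Besides being an accurate global solver, this method has three additional features. First, it is computationally efficient for solving complex heterogeneous agent models, and it does not suffer from the curse of dimensionality. Second, it provides a general and interpretable representation of the distribution over individual states, which is important for addressing the classical question of whether and how heterogeneity matters in macroeconomics. Third, it readily solves the constrained efficiency problem, which opens up new possibilities for studying optimal monetary and fiscal policies in heterogeneous agent models with aggregate shocks.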
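Because it is difficult to handle function approximation in high-dimensional spaces with uncertain environments, most existing theoretical analyses of reinforcement learning (RL) are limited to the tabular setting or to linear models. This work offers a fresh perspective on this challenge by analyzing RL in a general reproducing kernel Hilbert space (RKHS). We consider a family of Markov decision processes $\mathcal{M}$ in which the reward functions lie in the unit ball of an RKHS and the transition probabilities lie in a given arbitrary set. We define a quantity, the perturbational complexity by distribution mismatch $\Delta_{\mathcal{M}}(\epsilon)$, to characterize the complexity of the admissible state-action distribution space in response to a perturbation of scale $\epsilon$ in the RKHS. We show that $\Delta_{\mathcal{M}}(\epsilon)$ gives both a lower bound on the error of all possible algorithms and an upper bound for two specific algorithms (fitted reward and fitted Q-iteration) for the RL problem. Hence, the decay of $\Delta_{\mathcal{M}}(\epsilon)$ with respect to $\epsilon$ measures the difficulty of the RL problem on $\mathcal{M}$. We further provide some concrete examples and discuss whether $\Delta_{\mathcal{M}}(\epsilon)$ decays fast in these examples. As a byproduct, we show that when the reward functions lie in a high-dimensional RKHS, RL problems can still suffer from the curse of dimensionality, even if the transition probability is known and the action space is finite.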
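This paper concerns the convergence of empirical measures in high dimensions. We propose a new class of metrics and show that under such metrics, the convergence is free of the curse of dimensionality (CoD). Such a feature is critical for high-dimensional analysis and stands in contrast to classical metrics (e.g., the Wasserstein distance). The proposed metrics originate from the maximum mean discrepancy, which we generalize by proposing specific criteria for selecting test function spaces that guarantee the property of being free of CoD. Therefore, we call this class of metrics the generalized maximum mean discrepancy (GMMD). Examples of the selected test function spaces include the reproducing kernel Hilbert space, the Barron space, and flow-induced function spaces. Three applications of the proposed metrics are presented: 1. the convergence of the empirical measure in the case of random variables; 2. the convergence of an $n$-particle system to the solution of the McKean-Vlasov stochastic differential equation; 3. the construction of an $\varepsilon$-Nash equilibrium for a homogeneous $n$-player game from its mean-field limit. As a byproduct, we prove that, given a distribution close to the target distribution as measured by GMMD and a certain representation of the target distribution, we can generate a distribution close to the target in terms of the Wasserstein distance and relative entropy. Overall, we show that the proposed class of metrics is a powerful tool for analyzing the convergence of empirical measures in high dimensions without CoD.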
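For orientation, the classical maximum mean discrepancy that the GMMD generalizes can be estimated from two samples as in the sketch below. It uses a Gaussian-kernel RKHS as the test function space and the biased (V-statistic) estimator; the kernel choice, the bandwidth, and the function names `gaussian_kernel` and `mmd_squared` are assumptions made for illustration, not the paper's construction.

```python
import numpy as np

def gaussian_kernel(x, y, bandwidth=1.0):
    """Gaussian kernel matrix between the rows of x (n, d) and y (m, d)."""
    sq_dists = (
        np.sum(x**2, axis=1)[:, None]
        + np.sum(y**2, axis=1)[None, :]
        - 2.0 * x @ y.T
    )
    # Clip tiny negative values caused by floating-point round-off.
    return np.exp(-np.maximum(sq_dists, 0.0) / (2.0 * bandwidth**2))

def mmd_squared(x, y, bandwidth=1.0):
    """Biased estimate of the squared MMD between two empirical measures."""
    k_xx = gaussian_kernel(x, x, bandwidth)
    k_yy = gaussian_kernel(y, y, bandwidth)
    k_xy = gaussian_kernel(x, y, bandwidth)
    return k_xx.mean() + k_yy.mean() - 2.0 * k_xy.mean()
```

Calling `mmd_squared(x, y)` on two sample arrays of shape `(n, d)` and `(m, d)` returns a nonnegative number that is zero when the two empirical measures coincide.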
Developing algorithms for solving high-dimensional partial differential equations (PDEs) has been an exceedingly difficult task for a long time, due to the notoriously difficult problem known as the "curse of dimensionality". This paper introduces a deep learning-based approach that can handle general high-dimensional parabolic PDEs. To this end, the PDEs are reformulated using backward stochastic differential equations and the gradient of the unknown solution is approximated by neural networks, very much in the spirit of deep reinforcement learning with the gradient acting as the policy function. Numerical results on examples including the nonlinear Black-Scholes equation, the Hamilton-Jacobi-Bellman equation, and the Allen-Cahn equation suggest that the proposed algorithm is quite effective in high dimensions, in terms of both accuracy and cost. This opens up new possibilities in economics, finance, operational research, and physics, by considering all participating agents, assets, resources, or particles together at the same time, instead of making ad hoc assumptions on their inter-relationships.
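The reformulation mentioned above can be written out as follows. This is a sketch in notation assumed here ($\mu$, $\sigma$, $f$, $g$, $W_t$, and the time grid $t_n$ are not fixed by the abstract). Consider a semilinear parabolic PDE with terminal condition $u(T,x) = g(x)$:

$$
\frac{\partial u}{\partial t} + \frac{1}{2}\,\mathrm{Tr}\!\left(\sigma\sigma^{\mathsf T}\,\mathrm{Hess}_x u\right) + \nabla u \cdot \mu + f\!\left(t, x, u, \sigma^{\mathsf T}\nabla u\right) = 0 .
$$

Let $X_t$ solve the forward SDE $dX_t = \mu(t, X_t)\,dt + \sigma(t, X_t)\,dW_t$. By Itô's formula, $u(t, X_t)$ then satisfies the backward SDE

$$
u(t, X_t) - u(0, X_0) = -\int_0^t f\!\left(s, X_s, u(s, X_s), \sigma^{\mathsf T}(s, X_s)\nabla u(s, X_s)\right) ds + \int_0^t \left[\nabla u(s, X_s)\right]^{\mathsf T} \sigma(s, X_s)\, dW_s .
$$

An Euler discretization on a grid $0 = t_0 < t_1 < \dots < t_N = T$, with $\Delta t_n = t_{n+1} - t_n$ and $\Delta W_n = W_{t_{n+1}} - W_{t_n}$, gives

$$
u(t_{n+1}, X_{t_{n+1}}) \approx u(t_n, X_{t_n}) - f\!\left(t_n, X_{t_n}, u(t_n, X_{t_n}), \sigma^{\mathsf T}(t_n, X_{t_n})\nabla u(t_n, X_{t_n})\right)\Delta t_n + \left[\nabla u(t_n, X_{t_n})\right]^{\mathsf T} \sigma(t_n, X_{t_n})\,\Delta W_n .
$$

In a scheme of this kind, the term $\sigma^{\mathsf T}\nabla u$ at each time step is what a neural network would approximate, and the mismatch between the propagated value at $t_N$ and $g(X_{t_N})$ would serve as the training loss.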